[No money for PS? I use OpenCV!] Day 24 - Integrated Application 3 (app): Build a photo document scanner with OpenCV! Photo scanner, perspective projection

12th Ironman Contest

嗡嗡

2020-10-06 18:06:47


Let's start with a GIF of today's result:

[GIF: today's result]

-> Code for this article on GitHub:

Day24_自製文件掃描機_photo_scanner.ipynb

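Introduction

It's time to put everything we've learned so far to work!
The integrated-application posts pull together everything from the previous days in one project.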

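(App) Build a photo document scanner with OpenCV!

We're going to build a photo document scanner. Logically, it takes roughly these steps:

Main program (read the image)
Use yesterday's material to build a mouse-driven window that records clicked coordinates (mouse handling)
Run the perspective transform using the coordinates returned by the mouse handler

Main program (read the image)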

import cv2
import numpy as np

# show_img is the display helper defined elsewhere in the notebook
# Read the source image
ori_img = cv2.imread("./testdata/paper.jpg")
print("origin image: ")
show_img(ori_img)

print("Click on the four corners of the document, then press ENTER")
points = get_points(ori_img)
pts1 = np.float32(points)

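This part simply reads the image.
We then use our own get_points function to collect the four corner coordinates of the document's outline.

(Note: click the four corners of the document in clockwise order, starting from the top-left corner, so the points line up with the destination corners used in the transform later.)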
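The clicking order matters because each clicked point is paired one-to-one with a destination corner. As a rough sketch, a small helper could sort the four clicks into top-left, top-right, bottom-right, bottom-left order automatically. Note that order_corners is a hypothetical helper that is not part of the original notebook, and the sum/difference heuristic assumes the document is not rotated much in the photo.

```python
import numpy as np

def order_corners(points):
    """Sort four (x, y) points into TL, TR, BR, BL order (hypothetical helper)."""
    pts = np.asarray(points, dtype=np.float32)
    if pts.shape != (4, 2):
        raise ValueError("expected exactly four (x, y) points")

    s = pts.sum(axis=1)         # x + y: smallest at top-left, largest at bottom-right
    d = pts[:, 1] - pts[:, 0]   # y - x: smallest at top-right, largest at bottom-left

    top_left = pts[np.argmin(s)]
    bottom_right = pts[np.argmax(s)]
    top_right = pts[np.argmin(d)]
    bottom_left = pts[np.argmax(d)]
    return np.array([top_left, top_right, bottom_right, bottom_left], dtype=np.float32)

# Made-up clicks in a scrambled order
clicks = [(400, 30), (20, 500), (10, 20), (390, 520)]
ordered = order_corners(clicks)
print(ordered)
```

With something like this, pts1 could be built from order_corners(points) instead of relying on the click order, as long as the heuristic's assumption holds.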

Use yesterday's material to build a mouse-driven window that records clicked coordinates (mouse handling)

def mouse_handler(event, x, y, flags, data):
    if event == cv2.EVENT_LBUTTONDOWN:
        # Mark the clicked position with a filled red circle
        cv2.circle(data['img'], (x,y), 30, (0,0,255), -1)

        # Refresh the window with the marked image
        cv2.imshow("Image", data['img'])

        # Print (x, y) and store it in the list
        print("get points: (x, y) = ({}, {})".format(x, y))
        data['points'].append((x,y))

def get_points(img):
    # Build the data dict: 'img' holds the image, 'points' holds the clicked points
    data = {}
    data['img'] = img.copy()
    data['points'] = []

    # Report the image size (works for grayscale images too)
    h, w = img.shape[:2]
    print("img height, width: ({}, {})".format(h, w))

    # Create a resizable window
    cv2.namedWindow("Image", cv2.WINDOW_NORMAL)

    # Show the image in the window
    cv2.imshow('Image', img)

    # Register the mouse callback; all results are stored in the data dict
    cv2.setMouseCallback("Image", mouse_handler, data)

    # Wait for any key press (e.g. ENTER), then let OpenCV release the window
    cv2.waitKey(0)
    cv2.destroyAllWindows()

    # Return the list of points
    return data['points']

This part is the same as yesterday's finished code; see yesterday's post for the details.

Run the perspective transform using the coordinates returned by the mouse handler

target_height = 842 # A4 height in points (72 dpi)
target_width = 595 # A4 width in points (72 dpi)
pts2 = np.float32([[0,0],[target_width,0],[target_width,target_height],[0,target_height]])

# Compute the perspective transform matrix from the four point pairs
M = cv2.getPerspectiveTransform(pts1, pts2)

# Apply the perspective transform to the original image
res = cv2.warpPerspective(ori_img, M, (target_width, target_height))

print("photo scanner result: ")
show_img(res, bigger=True)

print("Doing threshold: ")
resgray = cv2.cvtColor(res, cv2.COLOR_BGR2GRAY) # convert the image to grayscale first
# Otsu's thresholding
ret, thresh = cv2.threshold(resgray,0,255,cv2.THRESH_BINARY+cv2.THRESH_OTSU)
show_img(thresh, bigger=True)

# Show the images in windows
cv2.imshow('Scanner',res)
cv2.imshow('Threshold',thresh)

cv2.waitKey(0)
cv2.destroyAllWindows()

First we define the size of the output document (here an A4 sheet: 595 x 842, width x height).
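For reference, 595 x 842 is the A4 sheet (210 mm x 297 mm) expressed in points, that is, at 72 pixels per inch. If a sharper scan is wanted, the same arithmetic gives the target size at other resolutions. a4_size_px is a hypothetical helper name used only for illustration.

```python
A4_WIDTH_MM = 210
A4_HEIGHT_MM = 297
MM_PER_INCH = 25.4

def a4_size_px(dpi):
    """Return (width, height) of an A4 sheet in pixels at the given DPI (hypothetical helper)."""
    width = round(A4_WIDTH_MM / MM_PER_INCH * dpi)
    height = round(A4_HEIGHT_MM / MM_PER_INCH * dpi)
    return width, height

print(a4_size_px(72))   # (595, 842), the size used in this post
print(a4_size_px(300))  # (2480, 3508), a common print resolution
```

The returned pair could be unpacked into target_width and target_height before building the destination points.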

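We use cv2.getPerspectiveTransform to compute the transform matrix from the four pairs of points.
With this matrix, the original image can be warped into the shape we want through matrix arithmetic.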
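To get a feel for what that matrix is, here is a NumPy-only sketch of the underlying math: four point pairs give eight linear equations for the eight unknown entries of a 3x3 matrix (the last entry is fixed to 1). perspective_matrix and apply_perspective are hypothetical helpers written for illustration, the corner coordinates are made up, and OpenCV's internal implementation may differ.

```python
import numpy as np

def perspective_matrix(src, dst):
    """Solve for the 3x3 matrix mapping four src points to four dst points (sketch)."""
    A, b = [], []
    for (x, y), (u, v) in zip(src, dst):
        # u = (h11*x + h12*y + h13) / (h31*x + h32*y + 1), rearranged to be linear in h
        A.append([x, y, 1, 0, 0, 0, -u * x, -u * y])
        b.append(u)
        # v = (h21*x + h22*y + h23) / (h31*x + h32*y + 1)
        A.append([0, 0, 0, x, y, 1, -v * x, -v * y])
        b.append(v)
    h = np.linalg.solve(np.array(A, dtype=np.float64), np.array(b, dtype=np.float64))
    return np.append(h, 1.0).reshape(3, 3)

def apply_perspective(M, point):
    """Map one (x, y) point through M using homogeneous coordinates."""
    x, y = point
    u, v, w = M @ np.array([x, y, 1.0])
    return u / w, v / w

# Made-up clicked corners (TL, TR, BR, BL) and the A4 destination corners
src = [(120, 80), (520, 60), (560, 700), (90, 740)]
dst = [(0, 0), (595, 0), (595, 842), (0, 842)]

M = perspective_matrix(src, dst)
mapped = [apply_perspective(M, p) for p in src]
print(np.round(mapped, 3))
```

Each clicked corner lands on its destination corner, which is the property the transform is built to satisfy; every other pixel is mapped by the same matrix.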

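We then pass that matrix to cv2.warpPerspective,
which applies the perspective transform to the original image (the previous call only computes the matrix; this call does the actual warping).
The output is exactly the result we're after~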

Finally, we can apply Otsu's thresholding ourselves to binarize the result,
turning the document into a black-and-white one, just like a printed page!
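To make it a bit clearer what the cv2.THRESH_OTSU flag does: Otsu's method picks the threshold that maximizes the between-class variance of the grayscale histogram. Below is a NumPy-only sketch of that idea. otsu_threshold is a hypothetical helper, the test image is made up, and the value it returns may differ from OpenCV's in edge cases such as ties.

```python
import numpy as np

def otsu_threshold(gray):
    """Return the threshold maximizing between-class variance (sketch, uint8 input)."""
    hist = np.bincount(gray.ravel(), minlength=256).astype(np.float64)
    prob = hist / hist.sum()

    omega = np.cumsum(prob)                  # share of pixels with value <= t
    mu = np.cumsum(prob * np.arange(256))    # cumulative mean up to t
    mu_total = mu[-1]

    denom = omega * (1.0 - omega)
    with np.errstate(divide="ignore", invalid="ignore"):
        sigma_b = (mu_total * omega - mu) ** 2 / denom
    sigma_b[denom == 0] = 0.0                # thresholds that leave one class empty

    return int(np.argmax(sigma_b))

# Made-up image: dark "ink" (50) on the left, bright "paper" (200) on the right
gray = np.full((10, 10), 50, dtype=np.uint8)
gray[:, 5:] = 200

t = otsu_threshold(gray)
binary = np.where(gray > t, 255, 0).astype(np.uint8)  # pixels above t become white
print(t, binary[0, 0], binary[0, 9])
```

The chosen threshold falls between the two gray levels, so the dark half turns black and the bright half turns white, which is the same effect we get on the scanned document.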

~~(Build it yourself and you won't need to download a paid app!)~~